import Warning from 'components/Markdown/Warning'

export const meta = {
  title: "Data Import & Export",
  position: 220,
}

## Data import

Data to be imported needs to adhere to the [Normalized Data Format](#normalized-data-format) (NDF). As of today, the conversion from any concrete data source (like MySQL, MongoDB or Firebase) to NDF must be performed manually. In the [future](https://github.com/graphcool/framework/issues/1410), the Prisma CLI will support importing from these data sources directly.

Here is a general overview of the data import process:

```
+--------------+                    +----------------+                       +------------+
| +--------------+                  |                |                       |            |
| |            | |                  |                |                       |            |
| | SQL        | |  (1) transform   |      NDF       |  (2) chunked upload   |   Prisma   |
| | MongoDB    | | +--------------> |                | +-------------------> |            |
| | JSON       | |                  |                |                       |            |
| |            | |                  |                |                       |            |
+--------------+ |                  +----------------+                       +------------+
  +--------------+
```

As mentioned above, step 1 has to be performed manually. Step 2 can then be done either by using the raw import API or by running the `prisma import` command from the CLI.

> To view the current state of supported transformations in the CLI and submit a vote for the one you need, you can check out [this](https://github.com/graphcool/framework/issues/1410) GitHub issue.

When uploading files in NDF, you need to provide the import data split across three different _kinds_ of files:

- `nodes`: Data for individual nodes (i.e. database records)
- `lists`: Data for a list of nodes
- `relations`: Data for related nodes

You can upload an unlimited number of files for each of these types, but it's recommended that each file be no larger than 1 MB. Larger files might cause timeouts.
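One way to stay under the recommended file size is to split the values of one kind into several NDF documents before writing them to disk. The following is a minimal sketch in Python, not part of the Prisma tooling: the helper name `chunk_ndf` is hypothetical, and it assumes that "1 MB" means 1,000,000 bytes of compactly serialized JSON.

```python
import json

MAX_BYTES = 1_000_000  # assumption: "1 MB" read as 1,000,000 bytes


def chunk_ndf(value_type, values, max_bytes=MAX_BYTES):
    """Split `values` into NDF documents whose compact JSON size stays within max_bytes.

    A single value that is larger than max_bytes still gets a document of its own.
    """
    def dumps(obj):
        # Compact separators and ASCII-only output, so characters equal bytes.
        return json.dumps(obj, separators=(",", ":"))

    base = len(dumps({"valueType": value_type, "values": []}))
    documents, current, current_size = [], [], base
    for value in values:
        item_size = len(dumps(value)) + (1 if current else 0)  # +1 for the comma
        if current and current_size + item_size > max_bytes:
            # Close the current document and start a new one with this value.
            documents.append({"valueType": value_type, "values": current})
            current, current_size = [], base
            item_size = len(dumps(value))
        current.append(value)
        current_size += item_size
    if current:
        documents.append({"valueType": value_type, "values": current})
    return documents
```

Each returned document can then be written to its own file.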

### Important considerations

#### General constraints

When importing data, the `id` field of a node can be at most 25 characters long.

#### Idempotency

Note that import operations are **not idempotent**: running an import **always adds** data to your service and **never updates** existing nodes. Importing the same dataset multiple times therefore leads to undefined behaviour.

For example, importing a node with the same `id` more than once will lead to undefined behaviour and likely break your service!

#### Data Validation

The [import API](#data-import-using-the-raw-import-api) does not perform any validation checks on the data to be imported. When using the [CLI](#data-import-with-the-cli) to import data, basic validation checks are executed.

Importing invalid data leads to undefined behaviour and might break your service! As the service maintainer, you are responsible for ensuring the validity of the imported data.

**Tip**: A good way to ensure valid data when importing is to inspect the data of a previous [export](#data-export) from a service with an identical data model.
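Because the raw import API performs no validation, it can help to run a few checks of your own before uploading. The sketch below is only an illustration in Python and does not reproduce the checks the CLI performs. The helper name `find_node_problems` is hypothetical, and it covers only the constraints mentioned on this page for `nodes` documents: string `_typeName` and `id` fields, the 25-character `id` limit, and no repeated `id` within one document.

```python
MAX_ID_LENGTH = 25  # limit stated under "General constraints"


def find_node_problems(document):
    """Return a list of human-readable problems found in a `nodes` NDF document."""
    problems = []
    if document.get("valueType") != "nodes":
        problems.append("valueType is not 'nodes'")
    seen = set()
    for index, node in enumerate(document.get("values", [])):
        type_name, node_id = node.get("_typeName"), node.get("id")
        if not isinstance(type_name, str):
            problems.append(f"values[{index}]: _typeName must be a string")
        if not isinstance(node_id, str):
            problems.append(f"values[{index}]: id must be a string")
            continue
        if len(node_id) > MAX_ID_LENGTH:
            problems.append(f"values[{index}]: id longer than {MAX_ID_LENGTH} characters")
        if node_id in seen:
            # Importing the same id twice leads to undefined behaviour.
            problems.append(f"values[{index}]: duplicate id {node_id!r}")
        seen.add(node_id)
    return problems
```

An empty list means none of these particular checks failed. It does not guarantee that the data matches your data model.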

### Data import with the CLI

The Prisma CLI offers the `prisma import` command. It accepts one option:

- `--data` (short: `-d`): A file path to a directory containing the data to be imported (this can be either a _regular_ or a _zipped_ directory)

Under the hood, the CLI uses the import API that's described in the next section. However, using the CLI provides some major benefits:

- uploading **multiple files** at once (rather than having to upload each file individually)
- **leveraging the CLI's authentication mechanism** (i.e. you don't need to manually send your authentication token)
- ability to **pause and resume** an ongoing import
- **import from various data sources** like MySQL, MongoDB or Firebase (_not available yet_)

#### Input format

When importing data using the CLI, the files containing the data in NDF need to be located in directories named after their type: `nodes`, `lists` and `relations`.

NDF files are JSON files following a specific structure, so the name of each file containing import data needs to end in `.json`. Within their respective directories (`nodes`, `lists` or `relations`), the `.json` files need to be numbered incrementally, starting with 1, e.g. `1.json`. The file name can be prefixed with any number of zeros, e.g. `01.json` or `0000001.json`.
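The directory and naming rules above can be sketched as a small Python helper that writes already-prepared NDF documents to disk. The function name `write_ndf_files` and the four-digit zero padding are illustrative choices, not requirements of the CLI.

```python
import json
from pathlib import Path


def write_ndf_files(root, documents, width=4):
    """Write each NDF document to <root>/<valueType>/<NNNN>.json.

    Files are numbered from 1 separately for each value type.
    """
    counters = {}
    written = []
    for document in documents:
        value_type = document["valueType"]  # "nodes", "lists" or "relations"
        counters[value_type] = counters.get(value_type, 0) + 1
        directory = Path(root) / value_type
        directory.mkdir(parents=True, exist_ok=True)
        path = directory / f"{counters[value_type]:0{width}d}.json"
        path.write_text(json.dumps(document), encoding="utf-8")
        written.append(path)
    return written
```

The resulting `root` directory is what you would pass to the `--data` option.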

#### Example

Consider the following file structure defining a Prisma service:

```
.
├── data
│   ├── lists
│   │   ├── 0001.json
│   │   ├── 0002.json
│   │   └── 0003.json
│   ├── nodes
│   │   ├── 0001.json
│   │   └── 0002.json
│   └── relations
│       └── 0001.json
├── datamodel.graphql
└── prisma.yml
```

`data` contains the files with the data to be imported, and all files ending in `.json` adhere to NDF. To import the data from these files, run the following command in the terminal:

```sh
prisma import --data data
```

### Data import using the raw import API

The raw import API is exposed under the `/import` path of your service's HTTP endpoint. For example:

- `http://localhost:4466/my-app/dev/import`
- `https://eu1.prisma.sh/my-app/prod/import`

One request can upload JSON data (in NDF) of at most 10 MB in size. Note that you need to provide your authentication token in the HTTP `Authorization` header of the request!

Here is an example `curl` command for uploading some JSON data (of NDF type `nodes`):

```sh
curl 'http://localhost:4466/my-app/dev/import' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJPbmxpbmUgSldUIEJ1aWxkZXIiLCJpYXQiOjE1MTM1OTQzMTEsImV4cCI6MTU0NTEzMDMxMSwiYXVkIjasd3d3LmV4YW1wbGUuY29tIiwic3ViIjoianJvY2tldEBleGFtcGxlLmNvbSIsIkdpdmVuTmFtZSI6IkpvaG5ueSIsIlN1cm5hbWUiOiJSb2NrZXQiLCJFbWFpbCI6Impyb2NrZXRAZXhhbXBsZS5jb20iLCJSb2xlIjpbIk1hbmFnZXIiLCJQcm9qZWN0IEFkbWluaXN0cmF0b3IiXX0.L7DwH7vIfTSmuwfxBI82D64DlgoLBLXOwR5iMjZ_7nI' \
-d '{"valueType":"nodes","values":[{"_typeName":"Model0","id":"0","a":"test","b":0,"createdAt":"2017-11-29 14:35:13"},{"_typeName":"Model1","id":"1","a":"test","b":1},{"_typeName":"Model2","id":"2","a":"test","b":2,"createdAt":"2017-11-29 14:35:13"},{"_typeName":"Model0","id":"3","a":"test","b":3},{"_typeName":"Model3","id":"4","a":"test","b":4,"createdAt":"2017-11-29 14:35:13","updatedAt":"2017-11-29 14:35:13"},{"_typeName":"Model3","id":"5","a":"test","b":5},{"_typeName":"Model3","id":"6","a":"test","b":6},{"_typeName":"Model4","id":"7"},{"_typeName":"Model4","id":"8","string":"test","int":4,"boolean":true,"dateTime":"1015-11-29 14:35:13","float":13.333,"createdAt":"2017-11-29 14:35:13","updatedAt":"2017-11-29 14:35:13"},{"_typeName":"Model5","id":"9","string":"test","int":4,"boolean":true,"dateTime":"1015-11-29 14:35:13","float":13.333,"createdAt":"2017-11-29 14:35:13","updatedAt":"2017-11-29 14:35:13"}]}' \
-sSv
```

The generic version for `curl` (using placeholders) would look as follows:

```sh
curl '__SERVICE_ENDPOINT__/import' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer __JWT_AUTH_TOKEN__' \
-d '{"valueType":"__NDF_TYPE__","values": __DATA__ }' \
-sSv
```
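If you prefer to call the import API from a script rather than from `curl`, the same request can be assembled with Python's standard library. This is a minimal sketch: the function name `build_import_request` is hypothetical, and the size check assumes that "10 MB" means 10,000,000 bytes.

```python
import json
import urllib.request

MAX_REQUEST_BYTES = 10_000_000  # assumption: "10 MB" read as 10,000,000 bytes


def build_import_request(endpoint, token, document):
    """Build (but do not send) a POST request for the /import path of a service."""
    body = json.dumps(document).encode("utf-8")
    if len(body) > MAX_REQUEST_BYTES:
        raise ValueError("NDF document exceeds the per-request size limit")
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/import",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request.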


## Data export

Exporting data can be done either using the CLI or the raw export API. In both cases, the downloaded data is formatted in JSON and adheres to the Normalized Data Format (NDF). As the exported data is in NDF, it can directly be imported into a service with an identical schema. This can be useful when test data is needed for a service, e.g. in a `dev` stage.

### Data export with the CLI

The Prisma CLI offers the `prisma export` command. It accepts one option:

- `--export-path` (short: `-e`): A file path to the `.zip` directory that the CLI creates and in which the exported data is stored

Under the hood, the CLI uses the export API that's described in the next section. However, using the CLI provides some major benefits:

- **leveraging the CLI's authentication mechanism** (i.e. you don't need to manually send your authentication token)
- **writing downloaded data directly to file system**
- **cursor management** in case multiple requests are needed to export all application data (when doing this manually you need to send multiple requests and adjust the cursor upon each)

#### Output format

The data is exported in NDF and will be placed in three directories that are named after the different NDF types: `nodes`, `lists` and `relations`.

### Data export using the raw export API

The raw export API is exposed under the `/export` path of your service's HTTP endpoint. For example:

- `http://localhost:4466/my-app/dev/export`
- `https://database.prisma.sh/my-app/prod/export`

One request can download JSON data (in NDF) of at most 10 MB in size. Note that you need to provide your authentication token in the HTTP `Authorization` header of the request!

The endpoint expects a POST request where the body contains JSON with the following contents:

```json
{
  "fileType": "nodes",
  "cursor": {
    "table": 0,
    "row": 0,
    "field": 0,
    "array": 0
  }
}
```

The values in `cursor` describe the offsets in the database from which data should be exported. Note that each response to an export request returns a new cursor in one of two states:

- Terminated (_not full_): If all the values for `table`, `row`, `field` and `array` are returned as `-1`, it means the export has completed.
- Non-terminated (_full_): If any of the values for `table`, `row`, `field` or `array` is different from `-1`, it means the maximum size of 10 MB for this response has been reached. If this happens, you can use the returned `cursor` values as the input for your next export request.
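The cursor handling described above can be sketched as a loop in Python. The HTTP call is passed in as a `send` function so that the loop itself stays independent of any client library. Both `export_all` and `send` are hypothetical names, and the sketch assumes the parsed response is a dictionary that carries the new cursor under a `cursor` key.

```python
START_CURSOR = {"table": 0, "row": 0, "field": 0, "array": 0}


def is_terminated(cursor):
    """True when every cursor value is -1, i.e. the export has completed."""
    return all(cursor[key] == -1 for key in ("table", "row", "field", "array"))


def export_all(file_type, send):
    """Call `send` repeatedly, feeding each returned cursor into the next request.

    `send` takes the request body as a dict and returns the parsed response.
    """
    responses = []
    cursor = dict(START_CURSOR)
    while True:
        response = send({"fileType": file_type, "cursor": cursor})
        responses.append(response)
        cursor = response["cursor"]  # assumed location of the new cursor
        if is_terminated(cursor):
            return responses
```

This is the bookkeeping that `prisma export` takes care of when you use the CLI instead.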

#### Example

Here is an example `curl` command for downloading some JSON data (of NDF type `nodes`):

```sh
curl 'http://localhost:4466/my-app/dev/export' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJPbmxpbmUgSldUIEJ1aWxkZXIiLCJpYXQiOjE1MTM1OTQzMTEsImV4cCI6MTU0NTEzMDMxMSwiYXVkIjasd3d3LmV4YW1wbGUuY29tIiwic3ViIjoianJvY2tldEBleGFtcGxlLmNvbSIsIkdpdmVuTmFtZSI6IkpvaG5ueSIsIlN1cm5hbWUiOiJSb2NrZXQiLCJFbWFpbCI6Impyb2NrZXRAZXhhbXBsZS5jb20iLCJSb2xlIjpbIk1hbmFnZXIiLCJQcm9qZWN0IEFkbWluaXN0cmF0b3IiXX0.L7DwH7vIfTSmuwfxBI82D64DlgoLBLXOwR5iMjZ_7nI' \
-d '{"fileType":"nodes","cursor":{"table":0,"row":0,"field":0,"array":0}}' \
-sSv
```

The generic version for `curl` (using placeholders) would look as follows:

```sh
curl '__SERVICE_ENDPOINT__/export' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer __JWT_AUTH_TOKEN__' \
-d '{"fileType":"__NDF_TYPE__","cursor": {"table":__TABLE__,"row":__ROW__,"field":__FIELD__,"array":__ARRAY__}}' \
-sSv
```

## Normalized Data Format

The Normalized Data Format (NDF) is used as an _intermediate_ data format for import and export in Prisma services. NDF describes a specific structure for JSON.

### NDF value types

When using the NDF, data is split across three different "value types":

- **Nodes**: Contains data for the _scalar fields_ of nodes
- **Lists**: Contains data for _list fields_ of nodes
- **Relations**: Contains data to connect two nodes via a relation by their _relation fields_

### Structure

The structure for a JSON document in NDF is an object with the following two keys:

- `valueType`: Indicates the value type of the data in the document (this can be either `"nodes"`, `"lists"` or `"relations"`)
- `values`: Contains the actual data (adhering to the value type) as an array

The following examples are based on this data model:

```graphql
type User {
  id: String! @unique
  firstName: String!
  lastName: String!
  hobbies: [String!]!
  partner: User
}
```

#### Nodes

In case the `valueType` is `"nodes"`, the structure for the objects inside the `values` array is as follows:

```js
{
  "valueType": "nodes",
  "values": [
    { "_typeName": STRING, "id": STRING, "<scalarField1>": ANY, "<scalarField2>": ANY, ..., "<scalarFieldN>": ANY },
    ...
  ]
}
```

The notation expresses that the fields `_typeName` and `id` are of type string. `_typeName` refers to the name of the SDL type from your data model. The `<scalarFieldX>` placeholders stand for the names of the scalar fields of that SDL type.

For example, the following JSON document can be used to import the scalar values for two `User` nodes:

```json
{
  "valueType": "nodes",
  "values": [
    {"_typeName": "User", "id": "johndoe", "firstName": "John", "lastName": "Doe"},
    {"_typeName": "User", "id": "sarahdoe", "firstName": "Sarah", "lastName": "Doe"}
  ]
}
```
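Since the conversion to NDF has to be done manually, here is a minimal Python sketch of how rows from another data source might be mapped to a `nodes` document. The function name `rows_to_nodes` and the shape of the input rows (plain dictionaries with an `id` key) are assumptions made for illustration.

```python
def rows_to_nodes(type_name, rows, scalar_fields):
    """Turn plain row dicts into a `nodes` NDF document.

    Only `id` and the listed scalar fields are copied. List and relation
    data belong in separate `lists` and `relations` documents.
    """
    values = []
    for row in rows:
        node = {"_typeName": type_name, "id": str(row["id"])}
        for field in scalar_fields:
            if field in row:
                node[field] = row[field]
        values.append(node)
    return {"valueType": "nodes", "values": values}
```

For the `User` type above, `scalar_fields` would be `["firstName", "lastName"]`.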

#### Lists

In case the `valueType` is `"lists"`, the structure for the objects inside the `values` array is as follows:

```js
{
  "valueType": "lists",
  "values": [
    { "_typeName": STRING, "id": STRING, "<scalarListField>": [ANY] },
    ...
  ]
}
```

The notation expresses that the fields `_typeName` and `id` are of type string. `_typeName` refers to the name of the SDL type from your data model. The `<scalarListField>` placeholder is the name of one of the list fields of that SDL type. Note that, in contrast to the scalar fields in `nodes`, each object can contain values for only one field.

For example, the following JSON document can be used to import the values for the `hobbies` list field of two `User` nodes:

```json
{
  "valueType": "lists",
  "values": [
    {"_typeName": "User", "id": "johndoe", "hobbies": ["Fishing", "Cooking"]},
    {"_typeName": "User", "id": "sarahdoe", "hobbies": ["Biking", "Coding"]}
  ]
}
```
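The one-field-per-object rule can be illustrated with a short Python sketch that emits a separate object for every list field of every row. The function name `rows_to_lists` and the input shape (plain dictionaries with an `id` key) are hypothetical.

```python
def rows_to_lists(type_name, rows, list_fields):
    """Turn plain row dicts into a `lists` NDF document.

    Every emitted object carries the values of exactly one list field.
    Missing or empty lists are skipped.
    """
    values = []
    for row in rows:
        for field in list_fields:
            if row.get(field):
                values.append(
                    {"_typeName": type_name, "id": str(row["id"]), field: list(row[field])}
                )
    return {"valueType": "lists", "values": values}
```

A type with two list fields would therefore produce up to two objects per node.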

#### Relations

In case the `valueType` is `"relations"`, the structure for the objects inside the `values` array is as follows:

```js
{
  "valueType": "relations",
  "values": [
    [
      { "_typeName": STRING, "id": STRING, "fieldName": STRING },
      { "_typeName": STRING, "id": STRING, "fieldName": STRING }
    ],
    ...
  ]
}
```

The notation expresses that the fields `_typeName`, `id` and `fieldName` are of type string.

`_typeName` refers to the name of an SDL type from your data model, and `fieldName` is the name of the relation field of that SDL type. Since the goal of the relation data is to connect two nodes via a relation, each element inside the `values` array is itself a pair (written as an array that always contains exactly two elements) rather than a single object, as was the case for `"nodes"` and `"lists"`.

For example, the following JSON document can be used to create a relation between two `User` nodes via the `partner` relation field:

```json
{
  "valueType": "relations",
  "values": [
    [
      { "_typeName": "User", "id": "johndoe", "fieldName": "partner" },
      { "_typeName": "User", "id": "sarahdoe", "fieldName": "partner" }
    ]
  ]
}
```
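The pair structure can be sketched in Python as follows. Each side of a relation is described here by a `(type name, id, field name)` tuple. The function names and the tuple representation are illustrative assumptions.

```python
def relation_pair(left, right):
    """Build one relation entry from two (type_name, node_id, field_name) tuples."""
    return [
        {"_typeName": type_name, "id": node_id, "fieldName": field_name}
        for (type_name, node_id, field_name) in (left, right)
    ]


def build_relations(pairs):
    """Wrap a sequence of (left, right) tuples into a `relations` NDF document."""
    return {
        "valueType": "relations",
        "values": [relation_pair(left, right) for left, right in pairs],
    }
```

Calling `build_relations([(("User", "johndoe", "partner"), ("User", "sarahdoe", "partner"))])` yields the document shown in the example above.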
